Constitute: The world's constitutions to read, search, and compare

Authors

  • Zachary Elkins
  • Tom Ginsburg
  • James Melton
  • Robert Shaffer
  • Juan Sequeda
  • Daniel P. Miranker
Abstract

A constitution forms the foundation of virtually all governments around the world. A surprisingly large number of constitutions change each year. On average, 30 constitutions are amended and 5 are completely replaced each year. Despite this level of change, no country changes its constitution often enough for the country’s officials to gain much experience as constitutional drafters. To address constitutional drafters’ need for systematic information, a web portal, Constitute, has been created using semantic technologies. The portal provides searchable access to the work of the Comparative Constitutions Project. The Comparative Constitutions Project has amassed online copies of over 700 constitutions dating from the present day back to 1789. Further, the project created an OWL ontology containing over 600 constitutional concepts, e.g. women’s rights, and is well into the process of tagging the text of its entire collection of constitutions. To build the Constitute portal, the Comparative Constitutions database was converted to an RDF representation using R2RML mappings and Capsenta’s Ultrawrap. The portal implements semantic search features so that constitution drafters may search, read and compare constitutions. Constitute is available at http://www.constituteproject.org

1 The Problem: Drafting New Constitutions

Constitutions empower and limit the institutions that govern society. In doing so, they are intimately linked to the provision of public goods. Outcomes such as democracy, economic performance and human rights protection are all associated with the contents of countries’ constitutions. It is little wonder that constitutions are often blamed for poor economic and political outcomes, or that such outcomes are followed by constitutional change. Both domestic and international actors view constitutional change as a means to spur economic, political and social development.
Even as international events make news, few people are aware of the scope of constitutional change. On average, 30 constitutions are amended and 5 are completely replaced each year [3]. Despite this level of constitutional change, no country changes its constitution often enough to have, within its ranks of public officials, people experienced in drafting constitutions. In fact, in the most common scenario, the people responsible for drafting constitutions or constitutional amendments have no prior experience in the matter, nor will they repeat their duties. Even basic background support, such as systematic information on the contents of other countries’ constitutions, or even of previous constitutions in their own country, is lacking. The mere existence of such systematic information would form a basis for methodically approaching the most basic question: which topics should be addressed in a constitution? Access to the corresponding text may serve to shape debate and enable more productive effort. In current practice, external advisors are frequently consulted. Even the most experienced advisors tend to rely on a small set of well-known models, and are only able to draw on anecdotal evidence when responding to specific questions.

2 The Solution: Constitute

In 2005, Elkins and Ginsburg launched a large-scale data-collection project: the Comparative Constitutions Project (CCP). This project identified and acquired almost every constitutional text within each country’s series of constitutional laws. Additionally, the CCP categorized the content of national constitutions’ original texts and subsequent amendments for all independent states from 1789 to the present day. Each constitution is organized by sections. Each section, if applicable, has been tagged with topics that represent the interpretations of domain experts. For example, Article 32 of the Constitution of Angola refers to the topic of right to privacy.
These texts, in electronic and searchable form, represent a unique and comprehensive repository of constitutional legislation. Given the scope and immediacy of the need, the first of the anticipated scholarly and practical applications of the corpus is Constitute (http://www.constituteproject.org). Constitute is a semantically enabled search portal, built using Semantic Web technologies. Its primary purpose is to address constitutional drafters’ need for systematic information on the contents of constitutions. Constitute allows users to access full texts of constitutions and excerpts from those texts on particular topics and geographic regions. For example, suppose a constitutional drafting committee wishes to review the constitutional language used on the topic of right to privacy in a subset of constitutions written in Europe after WWII. Right to privacy is but one example; the set of constitutional topics forms an OWL ontology with over 400 concepts. The data for each constitution (i.e. section name, content and topics) is represented in RDF. On average, each constitution consists of 9000 triples. The number of constitutions hosted in Constitute is steadily increasing. At the time of this writing it hosts over 160 constitutions.

1 http://comparativeconstitutionsproject.org/

3 Constitute Architecture

Domain experts created the Constitution Ontology in OWL, which represents the taxonomical relationship between constitutional topics and geographic regions. Subsequently, domain experts cleaned the constitution data from the CCP and represented it in a tabular format. Mappings between the tabular data and the Constitution Ontology are represented in R2RML, which is then used to generate the RDF using Ultrawrap. Finally, the RDF data and the Constitution Ontology are used to create a search portal built on top of the Google App Engine. Figure 1 represents the architecture of Constitute.

Fig. 1. Architecture of Constitute

The Comparative Constitutions Project

The Comparative Constitutions Project (CCP) has focused on the systematic collection and interpretation of constitutional text since 2005. Through an exhaustive and costly process, the CCP has collected, scanned, and, when needed, translated into English the text for 95% of documents in the universe of constitutional systems. To date, the CCP repository includes 789 of the world’s 839 constitutional systems, and 2877 of the 3234 amendments to these systems. In the process of interpreting constitutions, the CCP tagged each section of a constitution (e.g. Article 1), if applicable, with constitutional topics, such as right to privacy or separation of church and state. These topics come from a 669-question survey that the CCP developed in order to interpret constitutions. The interpretations of each constitution are represented as a set of tuples, each of which includes the constitution name, a short description of the topic, a topic code, and a numerical reference to an organizational header. A typical entry might read [Albania 2008, Official religion, offrel, 10.1]. This means that Section 10.1 of the Albanian constitution of 2008 has been tagged with the topic official religion.

Data Cleaning

A team of political science and law graduate students (the domain experts) cleaned the CCP data. First, a domain expert selects a constitution and downloads an uncorrected OCR scan of that constitution from the CCP repository. Subsequently, the domain expert cleans the document by fixing typos, errant line breaks and bad characters, and by formatting organizational headers (e.g. Chapter, Article, etc.). Next, a Python script consisting of regular expressions merges the clean text with the tag data and generates a tabular representation of the data. The tabular data is represented in XLS so that the domain experts can make use of the track changes feature.
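The merge step described above can be illustrated with a minimal Python sketch. This is not the CCP script: the header pattern, the placeholder section text, and the function names (split_sections, merge_tags) are assumptions made for illustration. Only the tuple layout [constitution, topic description, topic code, section reference] and the Albania entry come from the text.

```python
import re
from collections import defaultdict

# Tag tuples in the layout given in the text:
# (constitution, topic description, topic code, section reference).
# The entry is the example from the text.
TAGS = [
    ("Albania 2008", "Official religion", "offrel", "10.1"),
]

# Placeholder cleaned text (not the real constitutional wording).
CLEAN_TEXT = """\
10.1 <placeholder text of section 10.1>
10.2 <placeholder text of section 10.2>
"""

# Assumed header format: a dotted number at the start of a line.
HEADER = re.compile(r"^(\d+(?:\.\d+)*)\s+(.*)$")


def split_sections(text):
    """Return {section reference: section text} for lines matching HEADER."""
    sections = {}
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            sections[match.group(1)] = match.group(2)
    return sections


def merge_tags(constitution, text, tags):
    """Build one table row per section, attaching any topic codes tagged on it."""
    codes = defaultdict(list)
    for name, _description, code, ref in tags:
        if name == constitution:
            codes[ref].append(code)
    return [
        {"constitution": constitution, "section": ref, "text": body, "topics": codes[ref]}
        for ref, body in split_sections(text).items()
    ]


rows = merge_tags("Albania 2008", CLEAN_TEXT, TAGS)
for row in rows:
    print(row)
```

Each resulting row corresponds to one line of the tabular representation; in the workflow described in the text, such rows would be written to XLS for review by the domain experts.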
Around 10 domain experts were able to clean and generate 180 constitution XLS files in 9 months.

Constitution Ontology

A domain expert created the Constitution Ontology, which consists of two main parts: Topics and Geography. The Topics part consists of a taxonomy of the constitutional topics from the CCP. For example, the topic “Freedom of Religion” is a subclass of “Religion” and “Civil and Political Rights”. Additionally, “Religion” is a subclass of “Culture and Identity”, while “Civil and Political Rights” is a subclass of “Rights and Duties”. The Geography part is an import and extension of the FAO Geopolitical Ontology. New geographic sub-regions were added, such as the Balkans and the Middle East. Missing synonyms for country names were also added (e.g. Netherlands, Holland). Figure 2 depicts a portion of the Constitution Ontology. We are currently in the process of extending the ontology to include time eras (years, decades, etc.).

Generating RDF

Each constitution XLS was loaded into a Microsoft SQL Server database, one table per constitution. By loading the XLS into a relational database, the mappings between the constitution data and the Constitution Ontology could be represented in R2RML. R2RML, the Relational Database to RDF mapping language [2], and the Direct Mapping [1] are two standards recently ratified by the W3C to expose relational databases to the Semantic Web. Capsenta’s Ultrawrap, a productized version of a research prototype [4], was used to convert the constitutional data into RDF. Ultrawrap supports both W3C mapping standards, as well as both ETL and SPARQL execution on relational data. First, the Direct Mapping created an initial default mapping represented in R2RML. Subsequently, the R2RML was edited to use terms from the Constitution Ontology. In the current phase of Constitute, Ultrawrap generates periodic dumps of the constitution data as RDF.
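To make the table-to-RDF step concrete, the following Python sketch turns one tabular row into subject-predicate-object triples. It is a hand-rolled stand-in for what the R2RML mappings and Ultrawrap do, not their actual output: the example.org namespace, the property names (partOf, text, hasTopic), the row layout and the placeholder section text are all assumptions made for illustration.

```python
# Hypothetical namespace; the project's real URIs are not given in the text.
BASE = "http://example.org/constitute/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def slug(label):
    """Turn a label such as 'Albania 2008' into a URI-safe fragment."""
    return "_".join(label.split())


def row_to_triples(row):
    """Map one table row to a list of (subject, predicate, object) triples."""
    constitution = BASE + slug(row["constitution"])
    section = f"{constitution}/section/{row['section']}"
    triples = [
        (section, RDF_TYPE, BASE + "Section"),
        (section, BASE + "partOf", constitution),
        (section, BASE + "text", row["text"]),
    ]
    # One triple per topic: sections that share a topic share the topic URI.
    for topic in row["topics"]:
        triples.append((section, BASE + "hasTopic", BASE + slug(topic)))
    return triples


# Assumed row layout; the section text is a placeholder.
rows = [
    {
        "constitution": "Albania 2008",
        "section": "10.1",
        "text": "<placeholder section text>",
        "topics": ["Official religion"],
    },
]

triples = [triple for row in rows for triple in row_to_triples(row)]
for triple in triples:
    print(triple)
```

Because every section tagged with the same topic points at the same topic URI, constitutions become linked through shared topics, which is the property the portal's semantic search relies on.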
Search Portal

The Constitute search application was built using Google App Engine, Python and the RDFlib library. Free-text search is powered by indexing raw constitution text using the Google App Engine Search API. Semantic search is powered by the RDF triples and OWL ontology, which are stored in Google’s DataStore. Currently, Constitute implements lightweight inference. For example, Figure 3 shows a user typing “Religion”; the results are suggested topics that are semantically related to Religion, such as “Separation of church and state”, which is a subclass of Religion. Search results are delivered to the front end as JSON, and browser-based computation is provided by AngularJS. Where possible, HTML5 browser-based caching is used. CSS3 media queries are used to provide responsive design across desktop and mobile browsers.

2 http://www.capsenta.com
3 https://github.com/RDFLib

Fig. 2. The Constitution Ontology

4 Design Choices and Lessons Learned

Initially, Constitute was planned to be developed using non-Semantic Web technologies, such as XML, to encode the constitution data. The switch to Semantic Web technologies was motivated by two features: linking data and reasoning. Representing the constitution data in RDF enables the creation of links between constitutions through shared constitutional topics that are represented by URIs. Additionally, it opens the possibility of linking the constitutional data with other datasets in the LOD cloud, such as DBpedia, the New York Times, etc. The taxonomical relationships between the constitutional topics make it possible to perform subclass reasoning, as shown in Figure 3.

Fig. 3. Subclass reasoning in topic search: the user searches for “Religion” and gets related topics that are subclasses of Religion.

Before Semantic Web technologies were used, the relationships between the constitutional topics were not formally represented in any format. After receiving a brief introduction (a couple of hours) to ontologies and Protege, a domain expert initiated the task of creating the Constitution Ontology. For the domain expert, it was very intuitive to create the taxonomical relationships between the constitutional topics and to import the FAO Geopolitical Ontology. The ontology consists of classes, subclasses, object properties and datatype properties. Starting from the constitutional topics, the Constitution Ontology was created in approximately 150 hours. Future work involves refactoring the ontology in order to adhere to best practices. For example, we need to study whether SKOS should be used instead of subclasses in OWL. Additionally, the domain expert did not reuse common vocabulary when possible
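
The subclass reasoning used for topic suggestions (Figure 3) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the portal's code: the dictionary representation and the function name subtopics are hypothetical, and the edges are only those named as examples in the text.

```python
# Subclass edges taken from the examples in the text: child -> direct parents.
SUBCLASS_OF = {
    "Freedom of Religion": ["Religion", "Civil and Political Rights"],
    "Separation of church and state": ["Religion"],
    "Religion": ["Culture and Identity"],
    "Civil and Political Rights": ["Rights and Duties"],
}


def subtopics(topic, subclass_of=SUBCLASS_OF):
    """Return every topic that is a direct or indirect subclass of `topic`."""
    # Invert the child -> parents map into parent -> children.
    children = {}
    for child, parents in subclass_of.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)

    # Depth-first walk down the taxonomy, collecting each descendant once.
    found, stack = set(), [topic]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return sorted(found)


print(subtopics("Religion"))
print(subtopics("Culture and Identity"))
```

A search for "Religion" would then suggest both of its subclasses, and a search for a broader topic such as "Culture and Identity" would also reach them transitively through "Religion".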

Journal title:
  • J. Web Sem.

Volume 27  Issue

Pages  -

Publication date 2014